part of the graph.
© John Wiley & Sons, Inc.
FIGURE 17-3: Diagnostic graphs from a regression.
Determining how well the model fits the data
Several calculations in standard regression output indicate how closely the model fits your data:
The residual SE is the average scatter of the observed points around the fitted model. You want the points
to be close to the line, so a smaller residual SE is better. As shown in Figure 17-2, the residual SE is about
mmHg.
The multiple r² value represents the proportion of variability in the dependent variable explained by
the model, so you want it to be high. As shown in Figure 17-2, it is 0.52 in this example, indicating
a moderately good fit.
A statistically significant F statistic indicates that the model predicts the outcome significantly
better than the null model. As shown in Figure 17-2, the p value on the F statistic is 0.009, which is
statistically significant at α = 0.05.
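If you want to see where these three numbers come from, the following is a minimal sketch in Python, assuming NumPy and SciPy are available. The data are made up (a hypothetical weight predictor and a blood pressure outcome in mmHg), and the function name fit_summary is an illustration rather than part of any statistics package, so the results will not match Figure 17-2.

```python
import numpy as np
from scipy import stats


def fit_summary(x, y):
    """Fit y = b0 + b1*x1 + ... by least squares and return fit statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                      # treat a 1-D input as one predictor
    n, k = x.shape                          # n observations, k predictors
    design = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    predicted = design @ coef

    ss_res = np.sum((y - predicted) ** 2)   # scatter left over after the fit
    ss_tot = np.sum((y - y.mean()) ** 2)    # scatter around the null (mean-only) model
    df_res = n - k - 1                      # residual degrees of freedom

    residual_se = np.sqrt(ss_res / df_res)
    r_squared = 1.0 - ss_res / ss_tot
    f_stat = ((ss_tot - ss_res) / k) / (ss_res / df_res)
    p_value = stats.f.sf(f_stat, k, df_res)  # upper-tail area of the F distribution
    return {
        "residual_se": residual_se,
        "r_squared": r_squared,
        "f_stat": f_stat,
        "p_value": p_value,
    }


# Hypothetical data for illustration only (not the data behind Figure 17-2).
rng = np.random.default_rng(0)
weight = rng.uniform(50, 110, size=20)                  # assumed predictor, kg
sbp = 90 + 0.5 * weight + rng.normal(0, 10, size=20)    # assumed outcome, mmHg

summary = fit_summary(weight, sbp)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

The residual SE comes out in the units of the outcome (mmHg here), r² is a proportion between 0 and 1, and the p value on the F statistic compares the fitted model with a model that predicts the mean for everyone.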
Figure 17-4 shows another way to judge how well the model predicts the outcome. It’s a graph of
observed and predicted values of the outcome variable, with a superimposed identity line (the line where
the predicted value equals the observed value). Your program may offer this observed versus predicted graph, or you can
generate it from the observed and predicted values of the dependent variable. For a perfect prediction
model, the points would lie exactly on the identity line. The correlation coefficient of these points is
the multiple r value for the regression.
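As a rough illustration of generating this graph yourself, here is a sketch in Python using NumPy, with Matplotlib needed only if you call the plotting function. The data are hypothetical, the helper name plot_observed_vs_predicted is made up for this example, and putting observed values on the horizontal axis is an assumption rather than a description of Figure 17-4. The sketch also checks that the correlation between observed and predicted values, once squared, equals the multiple r² of the fit.

```python
import numpy as np

# Hypothetical data for illustration only (not the data behind Figure 17-4).
rng = np.random.default_rng(1)
weight = rng.uniform(50, 110, size=20)                  # assumed predictor, kg
sbp = 90 + 0.5 * weight + rng.normal(0, 10, size=20)    # assumed outcome, mmHg

# Least-squares fit with an intercept, then the predicted outcome values.
design = np.column_stack([np.ones_like(weight), weight])
coef, *_ = np.linalg.lstsq(design, sbp, rcond=None)
predicted = design @ coef

# Correlation between observed and predicted values: the multiple r.
multiple_r = np.corrcoef(sbp, predicted)[0, 1]

# Multiple r squared computed directly from the sums of squares.
r_squared = 1.0 - np.sum((sbp - predicted) ** 2) / np.sum((sbp - sbp.mean()) ** 2)

print(f"multiple r = {multiple_r:.3f}")
print(f"r squared  = {r_squared:.3f}")


def plot_observed_vs_predicted(observed, predicted):
    """Scatter observed against predicted values with a dashed identity line."""
    import matplotlib.pyplot as plt         # imported here so the rest runs without it

    fig, ax = plt.subplots()
    ax.scatter(observed, predicted)
    low = min(observed.min(), predicted.min())
    high = max(observed.max(), predicted.max())
    ax.plot([low, high], [low, high], linestyle="--")   # identity line
    ax.set_xlabel("Observed value")
    ax.set_ylabel("Predicted value")
    return fig
```

Calling plot_observed_vs_predicted(sbp, predicted) returns a Matplotlib figure you can display or save. The closer the points hug the dashed line, the better the model predicts the outcome.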